# Logistic/Probit regression
## linear probability model

The conditional probability of a binary outcome is modeled directly as a linear function, so fitted values are read as probabilities (and can fall outside $[0,1]$):

$$E(y \mid x) = x'\beta$$
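A minimal sketch, assuming statsmodels and simulated data (all values illustrative): OLS on a binary outcome, showing that fitted values can leave $[0,1]$.

```python
# A minimal sketch of the linear probability model: OLS on a binary y.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
p = np.clip(0.5 + 0.4 * x, 0, 1)          # illustrative true P(y=1|x)
y = rng.binomial(1, p)

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()
print(res.params)                          # estimated intercept and slope
out = ((res.fittedvalues < 0) | (res.fittedvalues > 1)).sum()
print(out, "fitted 'probabilities' fall outside [0, 1]")
```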
## Probit & Logit
$$P(y=1 \mid x) = F(x'\beta), \qquad P(y=0 \mid x) = 1 - F(x'\beta)$$

- $F(x) = \Phi(x)$, the standard normal CDF: probit, convenient when handling endogeneity.
- $F(x)$ = the logistic CDF: logit, convenient for computational simplicity.

A quick numerical comparison of the two links is sketched below.
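A minimal sketch, assuming scipy (grid values are illustrative), comparing the two choices of $F$:

```python
# Compare the probit and logit link functions F(x'beta) on a small grid.
import numpy as np
from scipy.stats import norm, logistic

xb = np.linspace(-3, 3, 7)          # candidate index values x'beta
probit = norm.cdf(xb)               # F = Phi, standard normal CDF
logit = logistic.cdf(xb)            # F = 1 / (1 + exp(-x))

for v, p, l in zip(xb, probit, logit):
    print(f"x'b = {v:+.1f}  probit = {p:.3f}  logit = {l:.3f}")
```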
## Logistic regression
The CDF of the logistic distribution takes the form of a logistic (sigmoid) function:

$$F(x) = \frac{1}{1 + e^{-(x-\mu)/s}}$$
Logistic regression assumes the standard logistic CDF ($\mu = 0$, $s = 1$) applied to a linear index:

$$P(Y=1 \mid X) = \frac{1}{1 + \exp\left(-\left(w_0 + \sum_{i=1}^{n} w_i X_i\right)\right)}$$

### latent score interpretation
$$y = 1 \iff y^* = x'w + \epsilon > 0,$$

where $\epsilon$ follows the standard logistic distribution.
$$P(y=1 \mid x) = P(y^* > 0 \mid x) = P(\epsilon > -x'w \mid x) = P(\epsilon < x'w \mid x) = F_\epsilon(x'w),$$

where the last step uses the symmetry of the logistic distribution. A small simulation check follows.
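A minimal simulation sketch (index value and sample size are illustrative assumptions) checking this identity:

```python
# Check that P(y=1 | x) = F(x'w) when y = 1{x'w + eps > 0}, logistic eps.
import numpy as np
from scipy.stats import logistic

rng = np.random.default_rng(0)
xw = 0.7                                  # illustrative index value x'w
eps = rng.logistic(size=1_000_000)        # standard logistic errors
y = (xw + eps > 0)                        # latent-score rule y = 1{y* > 0}

print(y.mean())                           # empirical P(y=1 | x)
print(logistic.cdf(xw))                   # F(x'w); should match closely
```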
### marginal effect

$$\text{Odds} = \frac{P(y=1 \mid x)}{P(y=0 \mid x)} = e^{x'w}$$
Each coefficient in $w$ is the marginal effect of the corresponding component of $x$ on the log odds, since $\ln \text{Odds} = x'w$; see the numeric sketch below.
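A numeric sketch with hypothetical coefficients: raising $x_1$ by one unit multiplies the odds by $e^{w_1}$.

```python
# Log-odds interpretation: a unit increase in x_j scales the odds by exp(w_j).
import numpy as np

w = np.array([0.5, -1.2])             # hypothetical coefficients
x = np.array([1.0, 2.0])              # hypothetical covariates

def odds(x, w):
    return np.exp(x @ w)              # Odds = exp(x'w)

x_plus = x + np.array([1.0, 0.0])     # increase x_1 by one unit
print(odds(x_plus, w) / odds(x, w))   # equals exp(w_1)
print(np.exp(w[0]))
```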
## MLE
$$\hat{W} = \arg\max_W \sum_l \ln P(Y_l \mid X_l, W)$$
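A minimal sketch, assuming statsmodels and simulated data (true coefficients are illustrative): `sm.Logit` maximizes exactly this log-likelihood.

```python
# Fit the logit MLE with statsmodels on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000
X = sm.add_constant(rng.normal(size=(n, 2)))       # [1, x1, x2]
w_true = np.array([0.3, 1.0, -0.5])                # illustrative truth
p = 1.0 / (1.0 + np.exp(-X @ w_true))
y = rng.binomial(1, p)

res = sm.Logit(y, X).fit(disp=0)   # maximizes the log-likelihood
print(res.params)                  # should be near w_true
```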
### log-loss function

From a machine-learning viewpoint, the MLE objective can be rewritten as a log-loss cost function, with the further assumption of a penalty term on the weights.
$$\hat{W} = \arg\min_W \; -C\sum_l \ln P(Y_l \mid X_l, W) + \|W\| = \arg\min_W \; -C\sum_l \left[ Y_l \ln p_l + (1-Y_l)\ln(1 - p_l) \right] + \|W\|,$$

where $p_l = P(Y_l = 1 \mid X_l, W)$.
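A minimal sketch, assuming scikit-learn: `LogisticRegression` minimizes a penalized log-loss of this form, where its `C` plays the role of $C$ above and `penalty="l2"` corresponds to the penalty term (scikit-learn uses the squared L2 norm rather than $\|W\|$).

```python
# Penalized log-loss via scikit-learn; data simulated as in the MLE sketch.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(5_000, 2))
w_true = np.array([1.0, -0.5])                     # illustrative truth
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.3 + X @ w_true))))

clf = LogisticRegression(penalty="l2", C=1.0)      # smaller C => stronger penalty
clf.fit(X, y)
print(clf.intercept_, clf.coef_)
```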
## Probit with endogeneity

From the latent score interpretation,
$$y_1 = \mathbb{1}\{y_1^* > 0\}$$
Structural model, where $(u, v)$ are bivariate normal with $\operatorname{var}(u) = 1$:

$$y_1^* = z_1'\delta_1 + \alpha_1 y_2 + u; \qquad y_2 = z_2'\delta_2 + v; \qquad u = \theta v + e$$

Here $y_2$ is endogenous: it is correlated with $u$ via $v$.
We have
$$\operatorname{var}(e) = 1 - \frac{\operatorname{cov}(v, u)^2}{\operatorname{var}(v)\operatorname{var}(u)} = 1 - \rho^2,$$

where $\rho$ is the correlation between $u$ and $v$.
By a GLS-style rescaling, $\tilde{e} = e/\sqrt{1-\rho^2} \sim N(0, 1)$:

$$y_1^* = z_1'\delta_1 + y_2\alpha_1 + v\theta + e$$

$$y_1^*/\sqrt{1-\rho^2} = z_1'\delta_1/\sqrt{1-\rho^2} + y_2\alpha_1/\sqrt{1-\rho^2} + v\theta/\sqrt{1-\rho^2} + e/\sqrt{1-\rho^2}$$

$$\tilde{y}_1^* = z_1'\tilde\delta_1 + y_2\tilde\alpha_1 + v\tilde\theta + \tilde{e}$$

The probit model then reduces to:
$$P(y_1=1 \mid z_1, y_2, v) = P(\tilde{y}_1^* > 0 \mid z_1, y_2, v) = \Phi(z_1'\tilde\delta_1 + \tilde\alpha_1 y_2 + \tilde\theta v)$$
This suggests a 2-step estimator, sketched below:

- Step 1: estimate $v$ by regressing $y_2$ on the exogenous variables (OLS) and taking the residuals $\hat{v}$.
- Step 2: run a probit of $y_1$ on $z_1$, $y_2$, and $\hat{v}$.
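A minimal sketch of the 2-step estimator on simulated data; parameter values and variable names are illustrative assumptions.

```python
# Two-step probit with an endogenous regressor y2, on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20_000
z1 = rng.normal(size=n)                     # exogenous regressor in y1 equation
z2 = rng.normal(size=n)                     # instrument appearing only in y2 equation
v = rng.normal(size=n)
theta = 0.6
e = rng.normal(scale=np.sqrt(1 - theta**2), size=n)   # so var(u) = 1
u = theta * v + e                           # u correlated with v
y2 = 0.5 * z2 + v                           # first-stage equation
y1 = (1.0 * z1 + 0.8 * y2 + u > 0).astype(int)

# Step 1: OLS of y2 on the exogenous variables, keep residuals v_hat.
Z = sm.add_constant(np.column_stack([z1, z2]))
v_hat = sm.OLS(y2, Z).fit().resid

# Step 2: probit of y1 on z1, y2 and v_hat.
X = sm.add_constant(np.column_stack([z1, y2, v_hat]))
res = sm.Probit(y1, X).fit(disp=0)
print(res.params)   # estimates of the rescaled (tilde) parameters
```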
## Probit with fixed effects
$$y_{it}^* = x_{it}'\beta + \alpha_i + \varepsilon_{it},$$

where $\alpha_i$ is an individual-specific intercept.
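A naive sketch, assuming statsmodels: pooled probit with one dummy per individual. This dummy-variable estimator is known to be biased when $T$ is small (the incidental parameters problem); it is shown only to illustrate the model.

```python
# Probit with individual dummies on a simulated panel (illustrative setup).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
N, T = 50, 50                              # individuals, periods (illustrative)
alpha = 0.8 * rng.normal(size=N)           # fixed effects alpha_i
x = rng.normal(size=(N, T))
beta = 1.0
ystar = beta * x + alpha[:, None] + rng.normal(size=(N, T))
y = (ystar > 0).astype(int)

D = np.kron(np.eye(N), np.ones((T, 1)))    # one dummy per individual
X = np.column_stack([x.ravel(), D])        # no common constant
res = sm.Probit(y.ravel(), X).fit(disp=0, maxiter=500)
print(res.params[0])                       # estimate of beta
```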